THE PROGRESS OF QUANTITATIVE METHODS IN PALAEONTOLOGY

by J. T. TEMPLE 


Abstract. The slow progress of quantitative methods in palaeontology during the past decade threatens the 
survival of palaeontology. Different phenetic and cladistic methods produce different estimates of phylogenetic 
relations; none of these estimates is a priori more authoritative than others; taxonomic and phylogenetic 
certainty is unattainable, and probabilistic estimates of phylogeny must be accepted. Phenetic methods are well 
suited to estimating phylogenies from palaeontological data. Objective definitions of taxonomic entities and 
attribute states are essential in phylogenetic analysis. Outline analysis and landmark analysis are discussed, and 
the practical advantages of the former are considered to outweigh any resulting loss of homology. Techniques 
of outline and surface measurement and analysis are reviewed briefly. Temple’s (1982a) review of ordination 
methods is supplemented to include standardization of entities, Projection Pursuit, Detrended Correspondence 
Analysis and Canonical Correspondence Analysis. 

An earlier article (Temple 1982a) dealt with the use of ordination techniques in palaeontology. 
The present article, which may be considered a supplement to the earlier one, has a three-fold 
purpose: 

1. to review the progress of quantitative methods in palaeontology during the last decade, and 
to assess the auguries for the future; 

2. to consider quantitative palaeontology in the context of two current methodological debates, 
namely phenetics vs cladistics, and outlines vs landmarks; 

3. to up-date the earlier article on ordination. 

There can be no doubt that palaeontology, if it is to survive into the next century as a serious 
branch of science, will do so only in so far as it has transformed itself into a quantitative discipline. 
By its nature, palaeontology contributes primarily to the study of aspects of evolution - variation, 
speciation, organic diversity, the pattern of change in time - and the evidence that palaeontology 
brings to bear on these topics carries conviction with scientists in other disciplines only in so far as 
it is expressed in quantitative terms. To take two examples: polymodality of size or shape can be 
established only by analysis of frequencies of scores on suitable size or shape indicators; the patterns 
(or even the existence) of morphological change in fossil lineages or of secular change in organic 
diversity need to be tested against the models that can be proposed for such data, e.g. random walk 
with/without drift, trend-stationary process, etc. (Diggle 1990; Nelson and Plosser 1982; Mills 
1990). 

It must, however, be admitted that the auguries for the change in outlook by palaeontologists 
essential to the survival of our subject are not good. It is, after all, not many years since the 
Palaeontological Association gave its annual President’s Award to a paper, on assemblage structure 
of a group of fossils, whose author stated that he had used no numerical methods whatever in his 
analysis. A recent compendium of palaeontological techniques sponsored by the Association 
(Briggs and Crowther 1989) gives only the most scanty coverage to quantitative methods. The eyes 
of the palaeontological world, it seems, still glaze over at the mention of eigenvectors - the word
does not, for instance, appear in the index to Briggs and Crowther. It is difficult to be optimistic
about the future when most of our colleagues have not yet learned even the vocabulary, let alone 
the practice, of quantitative palaeontology, after exposure to it during three decades. 

Furthermore, the enthusiasm with which cladistic methodology has been embraced by 
palaeontologists during the past decade (see below) is probably to be explained by the phenomenon 


[Palaeontology, Vol. 35, Part 2, 1992, pp. 475-484.]


© The Palaeontological Association 




noted in a wider context by Felsenstein (1988a, p. 113), namely that ‘young, but traditionally-trained
morphological systematists who tended, on the whole, to be uncomfortable with both
numerical and molecular methods ... found Hennig’s qualitative discussion more accessible than the
numerical work which had been slowly spreading for the previous decade’.

Nevertheless, we should not despair, in spite of the disappointments of the past decade. The 
progress of technology in fields relevant to palaeontology (i.e. image and spatial analysis, data 
handling, etc.) is now so rapid, and the technology itself is so readily accessible, that notwithstanding 
its present doldrums palaeontology could still be revolutionized by the turn of the century - but 
only if we continue to provide the channels for the relevant technology to diffuse into our subject. 


PHENETICS VS CLADISTICS

It seems that as far as most palaeontologists are concerned the phenetics vs cladistics debate has 
been resolved in favour of cladistics - whether for the reason suggested above or not. A mere 
handful of phenetic studies of fossil groups has been published, and even these few have been 
virtually ignored by other workers in the relevant fields. Phenetics is definitely not respectable in 
palaeontology, and word has come down from the mountain to this effect (Gould 1980, p. 110; see 
also Temple 1982b). On the other hand, a glance at any recent issue of Palaeontology will reveal
cladograms galore. In this respect, of course, our colleagues are by no means alone among 
taxonomists - see for instance the cladistic triumphalism of Ridley (1985, p. 8i) but contrast the 
more cautious rejection of phenetics by the same author the following year (Ridley 1986, pp. 83-85). 

In discussing the respective merits and demerits of cladistics and phenetics it is important to define 
the two terms accurately, and also to be clear about the purposes for which we are proposing to use 
these competing methodologies. As to our purposes for using cladistics or phenetics, I presume that 
nowadays we are not concerned primarily with classification as such, and certainly not with forcing 
palaeontological data into hierarchically nested Linnaean categories. Rather, we are trying to 
recover from our data a phylogenetic tree (not necessarily dichotomously branched) in which there 
is greater genetic interchange, and therefore in most cases greater phenotypic resemblance, along 
and within rather than between branches. Our task is therefore the same as that of the molecular 
biologist (Nei 1987, pp. 292 ff.; Felsenstein 1988b). Indeed it is simpler than his, for whereas the
molecular biologist attempts to reconstruct the whole tree (including hypothetical nodes, etc.) solely 
on the evidence of the tips of the branches, we start with a sample of the whole tree, including taxa 
at or near the nodes, a circumstance that simplifies enormously the reconstruction of the tree. 

When we come to define cladistics and phenetics our task is made difficult by the change that has 
overtaken the former word, from its broad original sense of the study of ancestor-descendant 
relations between taxa (Cain and Harrison 1960, p. 3; Sneath and Sokal 1973, p. 29) to its restriction 
to the particular form of cladistic analysis advocated by Hennig (1950, 1966); and also by the 
changes that Hennig’s original concept has itself undergone since 1966, and by the resulting debates 
between different schools of cladists (Ridley 1986, pp. 86-97), as fierce and bewildering to the 
outsider as the theological disputes of the fourth century A.D. For present purposes I adopt the
following broad definitions of cladistics and phenetics: cladistics reconstructs phylogeny on the 
basis of change in attribute states between ancestral and descendant taxa whose attribute states are 
specified; phenetics does so on the basis of distances (variously defined) between taxa whose 
attribute states are not necessarily specified (and may be unknown). Since change in attribute states 
is equivalent to one form of distance (City-block or Manhattan), the difference between the 
methodologies on these definitions may appear trivial. There is, however, an underlying difference 
in philosophy between cladistics and phenetics. Cladistics is concerned with discrete attribute states, 
which are assumed to be independent (Swofford and Olsen 1990, p. 415), and of which the coding 
either reflects the presumed evolution of the attributes or is determined a posteriori by some criterion 
(e.g. Lipscomb 1990): continuously distributed variables (including statistical means etc.) are 
difficult to handle by cladistic techniques and lead to results that should be treated with caution 
(Chappill 1989, p. 231), while some cladistic authors consider them to be inherently unsuitable for 




phylogenetic reconstruction (Pimentel and Riggins 1987; Farris 1990). Phenetics, on the other hand, 
accepts numerical data of any type (including continuously distributed, correlated variables), and 
requires attribute coding to be objective and to be completed prior to analysis. 
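
The equivalence noted above between change in attribute states and City-block distance can be illustrated by a minimal sketch in Python. The function name `city_block` and the two coded state lists are hypothetical, introduced only for illustration, and the attributes are assumed to be coded as integers with unit steps between successive states.

```python
def city_block(states_a, states_b):
    """City-block (Manhattan) distance between two lists of coded attribute states."""
    if len(states_a) != len(states_b):
        raise ValueError("both taxa must be scored on the same attributes")
    # Each term is the number of unit state changes separating the taxa on one attribute.
    return sum(abs(a - b) for a, b in zip(states_a, states_b))


# Hypothetical ancestor and descendant scored on five attributes (invented values).
ancestor = [0, 1, 1, 0, 2]
descendant = [1, 1, 0, 0, 0]
changes = city_block(ancestor, descendant)
print(changes)
```

On this coding the distance is simply the total number of state changes between the two taxa, which is why the two definitions may appear to differ only trivially.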

Hennig’s cladistic methodology (1950, 1966) postulated that the phylogeny of a group of taxa 
could be reconstructed by observing the distribution of the different (usually two) states of several 
morphological attributes, on the assumptions: (1) that the evolutionary sequence of states in each 
attribute is known; (2) that each change of state happens only at a single point in the phylogeny; 
and (3) that the sequence of states is irreversible. There is no doubt that the rigour which Hennig's 
methodology has brought to phylogenetic reconstruction has been beneficial, even if it is sometimes 
difficult to see the intellectual wood for the impenetrable undergrowth of jargon. The logic of 
determining phylogenies by shared derived attributes is unassailable, and it works very well at the 
naive level at which it is presented for didactic purposes, e.g. the relations between a small number 
of hypothetical entities (say, three or four species) based on six or seven attributes in each of which 
ancestral and derived states can be unequivocally distinguished and all of which suggest a unique 
phylogeny. The trouble is, of course, that the real world is not as tidy as this. The assignment of 
ancestral and derived states, although sometimes fairly straightforward and objective, as for 
chromosomal inversions, is in other cases difficult and controversial or, as for meristic and 
continuously distributed morphological attributes, inappropriate and simplistic; and in these cases 
it is disturbing that polarity needs to be incorporated into the analysis at an early stage whereas it 
would arise more logically (if at all) as output from the analysis itself. Furthermore, as the number 
of entities and attributes is increased, so there develops a conflict, due to violations of Hennig's 
assumptions (2) and (3), between the phylogenies suggested by different sets of attributes. This is 
the problem of homoplasy, or what Felsenstein (1982, p. 381) calls Hennig's dilemma, and Friday 
(1987, p. 66) non-divergent change. Recent studies of homoplasy quantify the problem and 
demonstrate convincingly the decline in the consistency index (an inverse measure of homoplasy) 
with increasing numbers of entities (Archie 1989; Sanderson and Donoghue 1989, text-fig. 1): the 
decline appears from Sanderson and Donoghue's data to be exponential, and the consistency falls 
to about 30% for 70 entities. In the face of this problem the pure logic of Hennig's methodology 
becomes hopelessly compromised by the need to make a subjective choice between the phylogenies 
supported by different sets of attributes. 
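
The consistency index mentioned above can be sketched as the ratio of the minimum conceivable number of state changes to the number actually required on a given tree. The sketch below counts changes by a Fitch-style pass for unordered attributes on a rooted, strictly dichotomous tree; this counting rule, the nested-tuple tree representation, the function names and the four-taxon data are all assumptions made for illustration, and are not taken from the studies cited.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, number of changes) for one attribute.

    `tree` is a nested tuple of taxon names; `states` maps each name to its state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0          # a tip: observed state, no change yet
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:                            # the two branches agree: no extra change
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def consistency_index(tree, attributes):
    """Ensemble index: summed minimum changes divided by summed observed changes."""
    minimum = sum(len(set(states.values())) - 1 for states in attributes)
    observed = sum(fitch_steps(tree, states)[1] for states in attributes)
    if observed == 0:
        raise ValueError("no attribute varies among the taxa")
    return minimum / observed


# Hypothetical tree and two binary attributes; the second conflicts with the tree.
tree = (("A", "B"), ("C", "D"))
attributes = [
    {"A": 0, "B": 0, "C": 1, "D": 1},
    {"A": 0, "B": 1, "C": 0, "D": 1},
]
ci = consistency_index(tree, attributes)
print(round(ci, 3))  # 0.667
```

The first attribute needs one change on this tree and the second needs two, so the index falls from 1 to 2/3; adding further conflicting attributes or taxa lowers it further.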

These problems in applying Hennig's original methodology have caused many cladists to 
abandon Hennig’s three basic assumptions, no longer assigning ancestral and derived attribute 
states a priori (palaeontologists, however, seem reluctant to take this step, except recently Adrain 
and Chatterton 1990), while acknowledging the extent of homoplasy and seeking the phylogeny that 
minimizes it. Indeed, some cladists (the so-called ‘pattern’ or ‘transformed’ cladists) no longer look
upon their cladograms as having phylogenetic significance. 

The distinction between cladistics and phenetics has to some extent become blurred by these 
changes in cladistic methodology. There appears in fact to be a continuous spectrum of phylogenetic 
techniques (reviewed by Felsenstein 1982; Swofford and Olsen 1990), between overtly cladistic, i.e. 
based on phylogenetic changes between known or hypothesized attribute states of taxa (e.g. 
molecular sequence data, discrete morphological data), and pure phenetic, i.e. based wholly on 
directly observed distances without knowledge of the states that contribute to these distances (e.g. 
nucleic acid hybridization and immunological comparison data). On this cladistic-phenetic 
spectrum the phenetic methodologies that have been used in palaeontology (ordinations and 
dendrograms based on distances derived from attribute state lists) lie towards the phenetic end. Let 
us examine the two main arguments against using such methods: (1) in replacing the original data 
matrix by an inter-entity distance matrix, phenetic methods discard valuable information (Farris 
1981, p. 22; Penny 1982); and (2) despite the claims of its proponents, phenetic methodology is not
objective (Ridley 1986, pp. 39 ff.). There is a third argument based on the non-metric properties of
distances, which is of relevance mainly to molecular sequence data and the concept of the molecular
clock: for the opposing views see Farris (1981, 1985, 1986) and Felsenstein (1984, 1986, 1988b, pp.
530-532).

The first argument against phenetics is a valid but not over-riding objection to those phenetic
methods that rely entirely on secondarily-derived distance data both in processing and in 
presentation. This is true, for instance, of dendrograms, and adds weight to other objections to the 
use of dendrograms, especially the subjectivity involved in flattening the cylindrical structure of the 
dendrogram onto the printed page for presentation (Sneath and Sokal 1973, pp. 261-264). The
argument also applies to ordination techniques that depend entirely on a secondarily-derived 
distance matrix, i.e. the various types of multidimensional scaling. It does not, however, apply to 
ordination by Principal Components, which operates directly on the data matrix, and in which the 
original data can in principle be recovered from the transformed data that underlie the ordination. 
Furthermore, it is not an argument that can be used convincingly to discredit phenetics in favour 
of cladistics, because any loss in phenetic information must be seen in the context of the larger 
amount of information available to phenetic methods from their ability to handle continuously 
distributed attributes. 

The second argument appears to be two-fold: (1) phenetic methods are sensitive to different 
choices of distance coefficients and clustering techniques, and (2) the subjective choices forced on the 
pheneticist in this way vitiate the objectivity claimed for phenetics. The validity of part (1) of the 
argument must be acknowledged. Recent work (Temple unpublished) shows that the concordance 
(as assessed, for instance, by nearest-neighbour relations) between distance matrices based on 
different coefficients decreases as the number of attributes increases. Furthermore, different 
clustering or ordinating techniques clearly produce different results even when they are operating 
on the same distance matrix, let alone when operating on different distance matrices. Part (2) of the 
argument cannot, of course, be used selectively against phenetics, since it applies with equal force 
to cladistics because of the need to choose between compatibility and the different forms of 
parsimony for resolving homoplasy: it is a valid inference from (1) only if the pheneticist or cladist 
accepts the need to make the subjective choice postulated by the argument. It cannot be denied that 
in the past many pheneticists (including the present author) and cladists have done so, either 
explicitly or implicitly. The valid response of the pheneticist or cladist to this dilemma, however, is 
to recognize that different coefficients and techniques produce different estimates of the phylogenetic 
relations (i.e. different tree topologies); that none of these estimates is a priori more authoritative 
than others; but that probabilities can be assigned to the different nodes and branches according 
to the frequencies with which they recur in different estimates, and that in this way a probabilistic 
estimate of the tree can be obtained. 
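
The probabilistic estimate proposed above can be sketched as a simple tally of how often each grouping of taxa recurs among the trees produced by different coefficients and techniques. In the sketch below each tree is reduced to the list of groups it contains; the function name `clade_frequencies`, the taxon labels and the three sets of results are hypothetical, and equal weighting of the estimates is an assumption.

```python
from collections import Counter


def clade_frequencies(trees):
    """Proportion of tree estimates in which each group of taxa recurs."""
    counts = Counter()
    for groups in trees:
        # A group is counted once per estimate, however it was written down.
        counts.update({frozenset(group) for group in groups})
    return {group: n / len(trees) for group, n in counts.items()}


# Hypothetical groupings of four taxa recovered by three different techniques.
estimates = [
    [{"A", "B"}, {"A", "B", "C"}],
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B"}, {"A", "B", "C"}],
]
support = clade_frequencies(estimates)
for group, p in sorted(support.items(), key=lambda item: (-item[1], sorted(item[0]))):
    print(sorted(group), round(p, 2))
```

Here the group A+B recurs in every estimate, whereas A+B+C and C+D are supported by two and one of the three estimates respectively, and would be reported with correspondingly lower probabilities.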

In practice, of course, few pheneticists or cladists are likely to go to the lengths of trying all the 
available techniques on their data. It is, however, in the spirit of the last paragraph to present an 
ordination, minimal spanning tree, Wagner tree or dendrogram as no more than the result of 
applying a particular technique to the data, without claiming to have produced the definitive answer 
- and I suspect that many pheneticists and cladists do in fact have this attitude towards their results, 
even if it is not formally articulated. So long as the results are interpreted in this probabilistic spirit, 
phenetic methods - with their ability to handle meristic and continuous, correlated variables - are 
well suited to estimating phylogenies from palaeontological data-sets, in which significant inter-attribute
correlations are known to occur (e.g. Temple and Tripp 1979, table 3; Temple 1980, table
5). In particular, since palaeontological data are a sample of the whole tree (including nodes or near-nodes),
an ordination in which each entity is linked by a minimal spanning tree to its nearest
neighbour (however defined) would be expected to converge progressively to the true phylogeny as
the density of sampling increases. Applications of this technique (Rowell 1970; Temple and Tripp 
1979; Temple 1980) might be criticized (although they do not in fact appear to have been so 
criticized) for implicit reliance on standardized Euclidean distance as the basis of phylogeny, but it 
is possible to make such ordinations overtly probabilistic by incorporating phylogenetic relations 
suggested by different distance measures (cf. Temple and Wu 1990, fig. 2). 
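
A minimal spanning tree on standardized Euclidean distances, of the kind referred to above, might be sketched as follows. Prim's algorithm is used here merely as one convenient way of finding the tree; the function names and the four-entity data matrix are hypothetical, and every attribute is assumed to vary (a constant attribute would give a zero standard deviation).

```python
import math


def standardize(data):
    """Scale each attribute (column) to zero mean and unit standard deviation."""
    columns = list(zip(*data))
    means = [sum(col) / len(col) for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / len(col))
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in data]


def minimal_spanning_tree(data):
    """Prim's algorithm; returns the links as (entity, entity, distance) triples."""
    z = standardize(data)
    n = len(z)
    dist = [[math.dist(z[i], z[j]) for j in range(n)] for i in range(n)]
    in_tree = {0}
    links = []
    while len(in_tree) < n:
        # Shortest link joining an entity already in the tree to one still outside it.
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda pair: dist[pair[0]][pair[1]])
        links.append((i, j, dist[i][j]))
        in_tree.add(j)
    return links


# Hypothetical entities (rows) scored on two attributes (columns).
entities = [[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [6.0, 7.0]]
links = minimal_spanning_tree(entities)
print([(i, j) for i, j, _ in links])
```

Substituting another distance measure for `math.dist` and tallying which links recur would be one way of making such a tree overtly probabilistic in the sense discussed above.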

We conclude from this discussion that, because of limitations of samples and techniques and 
because of the widespread occurrence of homoplasy, certainty is not attainable in phylogeny and 
taxonomy, and whatever methods we use we must be content with probabilistic statements 
(Felsenstein 1985; Penny and Hendy 1986; Sneath 1986). In this context the importance of
Felsenstein’s conclusions cannot be too strongly emphasized: ‘The adoption of a methodology that 
explicitly acknowledges uncertainty is a paradoxical necessity if phylogenetic inference is to be 
placed on a firm scientific footing’ (Felsenstein 1982, p. 399). 

Before leaving this topic it is worth remarking that any taxonomic/phylogenetic analysis -
phenetic, cladistic or whatever - is only as good as the data on which it is based, and in particular
that the objectivity of the results is limited by the objectivity of the original data. It is therefore of 
the utmost importance that taxonomic entities and attribute states should be defined objectively; 
and in both respects current palaeontological practice is lax. The most objective taxonomic entity in 
palaeontology is the topotype sample of a species (Temple and Tripp 1979, p. 234), and the most 
objective data are thus mean topotypic attribute states of species. In principle, analysis at higher 
taxonomic levels could be done either (1) by extending the type-concept vertically and representing 
any taxonomic level by the mean topotype attribute states of its type species; or (2) with some loss 
of objectivity, by grouping together topotype samples of designated species and calculating the 
relevant mean attribute states. In practice, neither of these procedures is normally followed, and 
many taxonomic/phylogenetic analyses in palaeontology are seriously weakened by being based on 
imprecisely and subjectively delimited ‘genera’, ‘families’, etc. As to attribute state definition we 
need only note that, if attribute states are not precisely and objectively defined, subsequent authors 
will be unable to repeat or extend the original observations, and not even the most rigorous analysis 
of such attributes could claim to be scientific. 

OUTLINES VS LANDMARKS 

The only valid objection that has ever been made to the use of quantitative methods in 
morphological palaeontology is that they are very time-consuming. This is, of course, no longer true 
of data-processing, but it is still true of observer-mediated data-gathering, i.e. measurement by an 
observer using callipers or eye-piece micrometer, and the problem is exacerbated by the need to 
process large samples in order to obtain statistically robust or significant results (cf. Temple 1987, 
p. 128). In these circumstances automated measurement is extremely desirable. 

The most convenient form of automated measurement is outline analysis, of which at least four 
types have been used in recent years in palaeontology and related subjects. The first and most 
extensively used method (e.g. Kaesler and Waters 1972; Healy-Williams 1983) has been polar 
Fourier analysis. In this method radii are drawn from the centroid of a closed curve (usually at equal 
angular intervals, say of 5°), and the length of each radius is plotted as a function of the angle of 
rotation of the radius (0-360°) from a zero starting direction. The resulting function is then Fourier- 
analysed as a sum of trigonometrical functions. Objections to this method are that it cannot be 
applied to complex curves with re-entrant angles, and that it depends on the identification of the 
centroid of the closed curve and on the definition of a zero point on the curve. The second type of 
outline analysis is elliptical Fourier analysis (Giardina and Kuhl 1977; Kuhl and Giardina 1982; 
Ferson et al. 1985). In this, a point travels around the closed curve at constant speed, and the x and
y coordinates are separately plotted as periodic functions of time and Fourier-analysed: the curve
is then approximated by superimposing a series of orthogonal ellipses (in a manner analogous to 
the Ptolemaic approximation to the elliptic planetary orbits by superimposing cycles of circles). The 
third type of outline analysis is eigenshape analysis (Lohmann 1983). Here a point travels around 
the closed curve at equal increments of arc, and chords are drawn to it from a zero point on the 
curve. The angles that successive chords make with the tangent at the zero point form a vector 
characterizing the curve (Zahn and Roskies 1972), and vectors from different curves form a data 
matrix that is analysed by Principal Components. Both elliptical Fourier analysis and eigenshape 
analysis are free from the objections to polar Fourier analysis noted above: both can deal with 
complex curves; neither makes use of the extraneous concept of the centroid; the elliptical Fourier 
coefficients are independent of the zero point, while Lohmann (1983) avoids the problem by an 
algorithm for matching the vectors from different original curves. A fourth method, perimeter-based
Fourier analysis, has been introduced recently (Foote 1989). In this, a point travels around
the closed curve at equal increments of arc as in eigenshape analysis, but chords are drawn to it from 
the centroid rather than from the zero point, the lengths and orientations of these chords being 
separately Fourier analysed. This method also can deal with complex curves, but has the 
disadvantage of depending on the centroid and a zero point. Of these various methods, elliptical 
Fourier analysis is the most attractive in principle and was found to perform well by Rohlf and 
Archie (1984). All four methods, however, are equally liable to the fundamental objections raised 
recently to the use of outline analysis in morphometrics.
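
Polar Fourier analysis as described above can be sketched in a few lines, assuming that the radii have already been measured from the centroid at equal angular intervals starting from the zero direction. The function name `polar_fourier` and the three-lobed test outline are hypothetical, and the sketch uses a direct summation rather than any particular published program.

```python
import math


def polar_fourier(radii, n_harmonics):
    """Fourier-analyse radii measured from the centroid at equal angular intervals.

    Returns the mean radius (zeroth harmonic) and a list of (amplitude, phase)
    pairs for harmonics 1..n_harmonics, so that
    r(theta) ~ mean + sum(amplitude_k * cos(k * theta - phase_k)).
    """
    n = len(radii)
    if n_harmonics >= n / 2:
        raise ValueError("too few radii for the number of harmonics requested")
    angles = [2.0 * math.pi * j / n for j in range(n)]
    mean_radius = sum(radii) / n
    harmonics = []
    for k in range(1, n_harmonics + 1):
        a = 2.0 / n * sum(r * math.cos(k * t) for r, t in zip(radii, angles))
        b = 2.0 / n * sum(r * math.sin(k * t) for r, t in zip(radii, angles))
        harmonics.append((math.hypot(a, b), math.atan2(b, a)))
    return mean_radius, harmonics


# Hypothetical three-lobed outline sampled every 5 degrees (72 radii).
radii = [1.0 + 0.2 * math.cos(3 * math.radians(5 * j)) for j in range(72)]
size, harmonics = polar_fourier(radii, 5)
for k, (amplitude, phase) in enumerate(harmonics, start=1):
    print(k, round(amplitude, 3), round(phase, 3))
```

The function returns the mean radius and the phase angles as well as the amplitudes, so that neither size nor phase information is discarded at this stage; for the test outline only the third harmonic has an appreciable amplitude.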

These objections have been cogently stated by Bookstein et al. (1982). Briefly, they are that 
outline analysis obscures homologies. (Full and Ehrlich (1986) direct this criticism specifically at 
eigenshape analysis and by implication absolve Fourier analysis; but see Rohlf 1986; Ehrlich and 
Full 1986.) Bookstein et al. (1982, fig. 1) illustrate their argument by two circular outlines, one with 
four equally spaced homologous landmarks, the other with the same landmarks unequally spaced: 
the two outlines are indistinguishable in their Fourier coefficients (but see below) even though they 
represent two very different morphologies. The example is striking, but then so is the counter-example
of Read and Lestrel (1986, fig. 2) - four equally spaced landmarks linked by two very
different outlines indistinguishable by landmark analysis. 

There is no doubt that the concept of homology underpins the whole of comparative morphology, 
and that we should not set it aside lightly. It is not clear, however, that we should restrict our 
morphometrics entirely to homologous landmarks, as has been done in a series of elegant papers
by Bookstein and his collaborators (e.g. Bookstein et al. 1985; Bookstein 1986). Quite a lot of useful 
biometric information is derived from measurements that are not strictly homologous, e.g. the 
maximum width of a brachiopod or of the frontal glabellar lobe of a trilobite at different growth- 
stages, the maximum measurable length of a bivalve mollusc without terminal umbones. 
Furthermore, even in the example given by Bookstein et al. (1982, fig. 1) the four landmarks would 
be likely in practice to disturb the circular outline (as indeed they do in the figure), and if the outlines 
were sufficiently finely digitized this disturbance would produce differences in the Fourier 
coefficients between the two cases. Finally, returning to our original theme, there is no doubt that, 
although algorithms could presumably be devised ad hoc to recognize homologous landmarks 
without observer participation, in the present state of technology outline analysis lends itself much 
more readily to automated measurement than does landmark analysis. 

I conclude that the theoretical objections to outline analysis are not so great as to outweigh the 
practical advantages of automated measurement that the method offers. There are, however, some 
important precautions that palaeontological users of outline analysis should observe: 

1. standardizing and defining accurately the viewing orientation: this may be relatively easy 
where the morphology itself defines a plane of symmetry or at least something approximating to 
such a plane, as in coccoliths, ostracodes, many bivalve molluscs, profile views of brachiopods and 
trilobites, etc.; it becomes more difficult when we wish to view along (rather than normal to) a plane 
of symmetry, as in non-profile views of brachiopods and trilobite cephala (Temple 1972, 1970, pp. 
4-6); and it becomes a non-trivial problem, that cannot be solved purely by definition, when we
wish to identify the axial view of a trochospiral foraminiferan; 

2. defining the reference zero point and/or (depending on the method used) zero direction on the 
outline; 

3. ensuring that size information is not inadvertently lost (by normalizing) early in the analysis: 
in Fourier methods size information is carried in the coefficients of the zeroth harmonic; 

4. using all the information derived from the analysis: in particular, in Fourier analysis the phase 
angles as well as the amplitudes carry information and should not be discarded as has happened in 
several palaeontological applications. 

Finally, it is important not to lose sight of the fact that most fossils are three-dimensional objects, 
of which two-dimensional outline analysis can give only an imperfect representation. Three- 
dimensional measurement and analysis are clearly desirable, and could in principle be done in 
several different ways, including: 

1. characterization by 3 orthogonal profiles;

2. direct, observer-mediated measurement of x, y and z coordinates (Lazarus 1986);

3. several different techniques of automated metrology (Jarvis 1986; Gasvik 1987), of which only 
contouring by holographic interferometry seems yet to have been applied to palaeontology (Elliott 
and Morris 1987). 

Method (1) leads to direct parameterization in terms of 3 sets of Fourier coefficients. Methods (2) 
and (3) could lead to parameterization in terms of: (a) two-dimensional polynomial or Fourier 
series (Davis 1986, pp. 405-447) for surfaces equivalent to single-valued distances from a plane of 
symmetry (e.g. many ostracodes); or (b) spherical harmonics (Jacobs 1974, pp. 310-314; Bomford 
1980, pp. 782-787) for surfaces equivalent to single-valued radii from a centroid (e.g. many 
planktonic foraminifera, acritarchs); or (c) differential geometry (Okamoto 1988), or (d) 
computerized surface representation methods (Tipper 1979). 

ORDINATION TECHNIQUES 

A major omission in the earlier article (Temple 1982a) was a section on the standardization of 
entities. The need for such standardization arises from ‘accidental’ differences between entities,
particularly from differences in sample size between sites in distributional data. It is possible in some
cases to remove the effects of such differences at a later stage of analysis, i.e. as the first eigenvector
in Principal Components Analysis by analogy with the growth eigenvector of morphological 
analysis. The analogy with morphological analysis data is, however, not exact, for whereas a growth 
eigenvector is usually a direction of increase in all the attributes, the first eigenvector of 
distributional data may be determined fortuitously by the dominant species in the largest available 
sample; furthermore, size of an individual is an intrinsic attribute, whereas size of a fossil sample 
depends on extrinsic factors such as ease/difficulty of collection. For these reasons, although it is 
possible (indeed desirable) to retain size differences in growth analysis data, it is better for purposes 
of site ordination to standardize distributional data to a standard size of (say) 100. The constraint 
imposed on data by standardizing to constant sample size has, however, the undesirable effect of 
distorting the inter-attribute correlation matrix by inducing spurious negative correlations (this is 
the ‘closure’ problem encountered in analysis of compositional data in petrology). Ordination of
the attributes (i.e. interrelations of taxa) in distributional data should therefore be based on 
unstandardized data. 
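
The effect of closure can be demonstrated by a small constructed example. In the sketch below three taxa are counted at eight sites in every combination of a low and a high abundance, so that the raw counts are uncorrelated by construction; the counts, the function names and the choice of 100 as the standard sample size are all assumptions made for illustration.

```python
import math
from itertools import product


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize_sites(counts, total=100.0):
    """Rescale each site (row) so that its counts sum to `total`."""
    return [[total * c / sum(row) for c in row] for row in counts]


# Hypothetical counts: all eight combinations of 10 or 30 specimens of three taxa.
counts = [list(row) for row in product([10, 30], repeat=3)]
closed = standardize_sites(counts)

raw_r = pearson([row[0] for row in counts], [row[1] for row in counts])
closed_r = pearson([row[0] for row in closed], [row[1] for row in closed])
print(round(raw_r, 3), round(closed_r, 3))
```

Before standardization the first two taxa are uncorrelated; after standardization to constant sample size their correlation is -0.5, although nothing about the taxa themselves has changed. This is the reason for basing ordination of the attributes on unstandardized data.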

Three ordination techniques have become available since the earlier article was prepared, namely
Projection Pursuit, Detrended Correspondence Analysis, and Canonical Correspondence Analysis.

Projection Pursuit (Friedman 1987; Jones and Sibson 1987) is designed to seek out projections 
(i.e. combinations of attributes) that maximize the inhomogeneity of the data. Projection Pursuit 
does not yet appear to have been used on palaeontological data, but would be expected to be useful 
for testing whether morphological variation (size and/or shape) was continuous or discontinuous. 
As yet, however, Projection Pursuit can apparently handle only fairly small data matrices.

Detrended Correspondence Analysis (Hill and Gauch 1980) is a technique designed to remove an 
artefact (the ‘horseshoe’ effect) that may appear in ordinations of phyto-sociological data. When
vegetation is sampled along an environmental gradient the gradually changing floral composition 
along the gradient would be expected to show up as a linear seriation of the sites along the first 
ordination axis, with sites at the environmental extremes furthest apart in the ordination; instead, 
there is a curvilinear relationship between site scores on the first two (or more) axes, so that the 
extreme sites are not at the extremes of the ordination. Digby and Kempton (1987, pp. 93-97; see
especially tables 3.2, 3.11, and figs 3.14-3.15) give a very clear account of the phenomenon, which 
is attributable to underestimation of the distances between sites at opposite environmental extremes 
and ultimately to a special, quasi-diagonal, form of the data matrix. Palaeontological data matrices 
of this form arise in biostratigraphy (e.g. Rickards 1976, table 1), and ordinations of such matrices 
might therefore be expected to lead to horseshoes. This is true, for instance, of the MDSCAL 
ordination of two pollen sequences by Gordon and Birks (1974, p. 237, fig. 7), and in this case 
detrending would probably make more perspicuous the stratigraphical correlation established by
Gordon and Birks between the two sequences. The possibility therefore arises of correlating 
stratigraphical sequences by the scores of the individual horizons on the first axis of a Detrended 
Correspondence Analysis. 
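
The origin of the horseshoe in a quasi-diagonal data matrix can be illustrated by a small constructed example. The sketch below is not Detrended Correspondence Analysis: it merely ordinates the sites of a hypothetical banded presence/absence matrix by Principal Components (extracted by power iteration from the site cross-product matrix), in order to show the curvilinear second axis that detrending is designed to remove. The matrix dimensions, function names and starting vector are assumptions.

```python
import math


def band_matrix(n_sites, width):
    """Presence/absence matrix in which each species occupies `width` consecutive sites."""
    return [[1.0 if first <= site < first + width else 0.0
             for first in range(1 - width, n_sites)]
            for site in range(n_sites)]


def leading_axis(matrix, iterations=1000):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    v = [(0.37 * i + 1.0) ** 2 for i in range(n)]  # arbitrary, deliberately asymmetric start
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(m * x for m, x in zip(row, v))
                     for vi, row in zip(v, matrix))
    return eigenvalue, v


def first_two_axes(data):
    """Site scores (scaled to unit length) on the first two principal axes."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centred = [[x - m for x, m in zip(row, means)] for row in data]
    cross = [[sum(a * b for a, b in zip(ri, rj)) for rj in centred] for ri in centred]
    l1, v1 = leading_axis(cross)
    # Remove the first axis so that power iteration converges on the second.
    deflated = [[cross[i][j] - l1 * v1[i] * v1[j] for j in range(n)] for i in range(n)]
    _, v2 = leading_axis(deflated)
    return v1, v2


sites = band_matrix(15, 5)  # 15 sites along a gradient, 19 species
axis1, axis2 = first_two_axes(sites)
for s1, s2 in zip(axis1, axis2):
    print(round(s1, 3), round(s2, 3))
```

In this example the two extreme sites lie at opposite ends of the first axis but share the same sign on the second, with the middle of the gradient taking the opposite sign: plotted against one another the two axes trace the arch described above.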

Canonical Correspondence Analysis (ter Braak 1986) is a technique that produces a simultaneous 
three-fold ordination of sites at each of which have been observed the occurrences or abundances 
of various taxa and the values of various (environmental) variables: a detrending option is available.
This powerful technique clearly has considerable potential in palaeoecological studies, but no 
applications appear to have been published so far. 

Finally, it may be useful to list some relevant general publications that have appeared in the last 
decade. There are good summaries of ordination techniques by Gordon (1981), Dunn and Everitt 
(1982), ter Braak (1987), and Digby and Kempton (1987), as well as in the new edition of Davis’s 
invaluable book (1986). Several examples of palaeontological ordination are given in Temple 
(1987). 